## MetaGFN: Exploring Distant Modes with Adapted Metadynamics for Continuous GFlowNets
Please cite this paper if you use this code in your research.

## Description
This project provides a framework for training continuous GFlowNets with Metadynamics exploration. 
It includes all code necessary to replicate the experiments in the paper.

## Features

- Two continuous environments: LineEnvironment and AlanineDipeptideEnvironment
- GFlowNet losses: Detailed Balance, Trajectory Balance, Subtrajectory Balance
- Exploration strategies: on-policy, noisy exploration, Thompson sampling, metadynamics, epsilon-noisy exploration
- Adaptable to new environments: implement the environment interface and the metadynamics interface
- Real-time visualisation during training: adapt and run the cells in `metad_gfns.ipynb`

## Installation

To install and set up the project, follow these steps:

1. Clone the repository: `git clone https://github.com/dominicp6/metad_contgfn`

2. Install the required dependencies: `pip install -r requirements.txt`

## Configuration Parameters

Preset configurations are available in the `config` folder.

### Usage

To run an experiment, run the following command from the root directory of the project:

```bash
python ./src/exp_script.py --config ./src/config/line_environment.json --exp_name line_environment_test
```

To run with parameters other than the default configuration, either edit the config file or pass the parameters as command-line arguments. For example, to use the detailed balance loss instead of the default trajectory balance loss, run:

```bash
python ./src/exp_script.py --config ./src/config/line_environment.json --exp_name line_environment_DB_test --loss DB 
```
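The same change can also be made by writing a modified copy of a preset file. The sketch below is a minimal illustration and not part of the repository: the helper `set_nested` is hypothetical, and it assumes that `loss` sits under the `gfn` section, as in the parameter listing in this README. It works on an in-memory dictionary with illustrative values, so it runs without the repository's files.

```python
import copy
import json


def set_nested(config, dotted_key, value):
    """Return a copy of config with the value at a dotted path replaced."""
    updated = copy.deepcopy(config)
    node = updated
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        node = node[key]  # raises KeyError if a section is missing
    if leaf not in node:
        raise KeyError(f"unknown parameter: {dotted_key}")
    node[leaf] = value
    return updated


# Stand-in for a preset loaded with json.load(); values are illustrative only.
preset = {"exp_name": "line_environment_test", "gfn": {"loss": "TB", "lambda": 0.9}}

db_config = set_nested(preset, "gfn.loss", "DB")
db_config = set_nested(db_config, "exp_name", "line_environment_DB_test")

# The result could be saved with json.dump() and passed to --config.
print(json.dumps(db_config, indent=4))
```

Working on a deep copy leaves the preset untouched, so several variants can be derived from one base configuration.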

Alternatively, run the experiments in the Jupyter notebook `metad_gfns.ipynb` to visualise training in real time.
Run the configuration cell with the desired parameters, then the visualisation cell, then the training cell.

Explanation of config parameters:
```
{
    "exp_name": Name of the experiment (str),
    "master_dir": The directory where the results will be saved (str),
    "plot": Should be false if not running in a Jupyter notebook (bool),
    "freq_plot": How often to plot the results when running in a Jupyter notebook (int),
    "n_iterations": The total number of iterations to run the experiment for (int),
    "batch_size": The batch size for training the neural network (int),
    "device": Device to run the experiment on (str; "cpu" or "cuda"),
    "seed": Seed for reproducibility (int; default 4444),
    "freq_md": How often to run metadynamics batches during training (int; default every 10 batches),
    "freq_rb": How often to run replay buffer batches during training (int; default every 2 batches),
    "repeats": Number of times to repeat the experiment. Each repeat uses a different seed, and the repeats are run in parallel on different threads (int),
    "n_threads": Number of threads to use for parallel experiments (int; default -1 = all available),
    "data_saving": {
        "on_policy_samples": Number of on-policy samples to use when computing e.g. L1_policy_error (int),
        "freq_data_save": How often to save the data (int; default every 50 batches),
        "loss": Whether or not to save the loss data (bool),
        "logZ": Whether or not to save the logZ data (bool),
        "L1_policy_error": Whether or not to save the L1 policy error data (bool),
        "L1_potential_kde_error": Whether or not to save the L1 potential KDE error data (bool),
    },
    "env": {
        "env_name": The name of the environment to use (str; "line" or "alanine_dipeptide"),
        "mixture_dimension": Mixture dimension for the environment (int),
        "max_log_concentration": Maximum log concentration for the environment. Only needed for alanine environment (float),
        "min_log_concentration": Minimum log concentration for the environment. Only needed for alanine environment (float),
        "max_policy_std": Maximum sigma for the environment. Only needed for line environment (float),
        "min_policy_std": Minimum sigma for the environment. Only needed for line environment (float),
        "lower_bound": Lower bound for the environment (float),
        "upper_bound": Upper bound for the environment (float),
    },
    "gfn": {
        "optimizer": The optimizer to use for training the neural network (str; "adam" or "sgd"),
        "gradient_clipping": Whether or not to clip the gradients (bool),
        "clip_value": The value to clip the gradients to (float),
        "trajectory_length": The length of the trajectory to use for training the neural network (int),
        "hidden_dim": The hidden dimension of the neural network (int),
        "n_hidden_layers": The number of hidden layers in the neural network (int),
        "lr_model": The initial learning rate for the neural network (float),
        "lr_logz": The initial learning rate for the logZ (float),
        "lr_schedule": The learning rate schedule for the neural network (str; default "linear"),
        "loss": The loss function to use for training the neural network (str; "TB" or "DB" or "STB"),
        "lambda": The lambda parameter for the STB loss function (float; default 0.9),
        "regularization": Whether or not to use regularization (bool),
        "regularization_const": The regularization constant (float),
        "tie_weights": Whether or not to tie the weights of the torsos of the neural network across the different models for logPF, logPB etc. (bool),
        "log_reward_clip_min": The minimum value to clip the log reward (float),
        "thompson_sampling": Whether or not to use Thompson sampling (bool),
        "thompson_sampling_num_heads": The number of heads to use for Thompson sampling (int),
        "noise_exploration": {
            "active": Whether or not to use noisy exploration (bool),
            "initial_noise": The initial noise for the noisy exploration (float),
            "final_noise": The final noise for the noisy exploration (float),
            "noise_profile": The profile of the noise for the noisy exploration (str; default "exponential_flat"),
        },
        "epsilon_noisy": {
            "active": Whether or not to use epsilon noisy exploration (bool),
            "epsilon": The epsilon value for epsilon noisy exploration (float),
        }
    },
    "metad": {
        "active": Whether or not to use metadynamics (bool),
        "train": Whether or not to train the metadynamics model (bool),
        "delta_t": The delta t for the metadynamics model (float),
        "n": The number of inner iterations in metadynamics per outer iteration (int),
        "beta": The inverse temperature parameter for the metadynamics model (float),
        "gamma": The friction parameter for the metadynamics model (float),
        "w": The deposition height for the bias potential (float),
        "epsilon": The epsilon parameter for the metadynamics model (float),
        "mu0": Vector of means for the initial Gaussian position distribution (list),
        "var0": Vector of variances for the initial position Gaussian distribution (list),
        "mu_p0": Vector of means for the initial Gaussian momentum distribution for the bias potential (list),
        "var_p0": Vector of variances for the initial momentum Gaussian distribution for the bias potential (list),
    },
    "replay_buffer": {
        "sampling_method": The sampling method for the replay buffer (str; default "biased"),
        "reward_only": Whether or not to use reward only sampling (bool). True means that trajectories are regenerated on the fly when retrieved from the replay buffer,
        "active": Whether or not to use a replay buffer (bool),
        "capacity": The capacity of the replay buffer (int),
        "alpha": The alpha parameter for the replay buffer for bias sampling (float),
        "beta": The beta parameter for the replay buffer for bias sampling (float),
        "log_reward_threshold": The threshold for the log reward for the replay buffer (float),
        "grid_spacing": The grid spacing for the replay buffer (float),
    }
}
```
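Several parameters accept only a fixed set of strings, so a small pre-flight check can catch typos before a long run starts. The sketch below is one way such a check could look, not code from the repository: `ALLOWED` and `check_config` are hypothetical names, and the allowed values are copied from the listing above.

```python
# Allowed string values, taken from the parameter listing above.
ALLOWED = {
    ("device",): {"cpu", "cuda"},
    ("env", "env_name"): {"line", "alanine_dipeptide"},
    ("gfn", "optimizer"): {"adam", "sgd"},
    ("gfn", "loss"): {"TB", "DB", "STB"},
}


def check_config(config):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for path, allowed in ALLOWED.items():
        name = ".".join(path)
        node = config
        try:
            for key in path:
                node = node[key]
        except (KeyError, TypeError):
            problems.append(f"missing parameter: {name}")
            continue
        if not isinstance(node, str) or node not in allowed:
            problems.append(f"{name}={node!r} not in {sorted(allowed)}")
    return problems


# Hypothetical configs for illustration: one valid, one with two mistakes.
good = {"device": "cpu", "env": {"env_name": "line"},
        "gfn": {"optimizer": "adam", "loss": "TB"}}
bad = {"device": "gpu", "env": {"env_name": "line"},
       "gfn": {"optimizer": "adam"}}

print(check_config(good))
print(check_config(bad))
```

The check only covers parameters whose allowed values are stated explicitly in the listing; numeric ranges and defaults are left to the training code.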

## Files

The project is structured as follows:

- `config/`: Contains the configuration files for the experiments
- `data/`: Contains data needed for the experiments
- `src/`: Contains the source code for the project
  - `gfn/`: Contains the GFlowNet implementations
  - `envs/`: Contains the environment implementations
  - `metad_gfns.ipynb`: Notebook for running interactive experiments
  - `networks.py`: Contains the neural network architectures
  - `metadynamics.py`: Contains the metadynamics implementation
  - `noise_schedule.py`: Various noise scheduling strategies (for noisy exploration)
  - `train.py`: Contains the training loop
  - `replay_buffer.py`: Contains the replay buffer implementation
  - `exp_script.py`: Script for running experiments
